Meeting 2 | 19/08/2024
Henrique Costa | Métodos Estratégicos em FinQuant
To what extent can the exact solution of the approximate problem serve as the approximate solution of the real problem?
The model specification should be a simplification of the real problem.
Some simple graphics are easy to describe and may even have ready-made names.
A grammar of graphics will help us describe more complex graphics.
Graphics require data (e.g., tibbles), which describe observations using variables.
Graphics require aesthetic mappings, which connect data variables to visual qualities.
Graphics require scales, which connect specific data values to specific aesthetic values.
Graphics require geometric objects (geoms), which represent the observations.
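To make the role of scales concrete, here is a minimal sketch (not part of the original slides) that builds a plot and then inspects the computed layer data with ggplot2's layer_data(). It assumes only the ggplot2 package and its bundled mpg dataset; the names p_demo, built, and scale_lookup are illustrative.

```r
library(ggplot2)  # provides ggplot(), layer_data(), and the mpg dataset

# Data: mpg. Aesthetic mappings: displ -> x, hwy -> y, drv -> color.
# Geom: points. The color scale is left at its default.
p_demo <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# layer_data() returns the layer's data after the scales have been applied:
# one row per observation, holding aesthetic values rather than data values.
built <- layer_data(p_demo, 1)

# Pair each drv value with the color code the scale assigned to it.
scale_lookup <- unique(data.frame(drv = mpg$drv, colour = built$colour))
print(scale_lookup)
```

Each of the three drv values should appear once in scale_lookup, paired with a single color code, which is the "specific data value to specific aesthetic value" connection that a scale provides.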
Graphics layers are combined with + rather than |>.
# SETUP: We will need tidyverse and an example dataset
library(tidyverse)
mpg
# ==============================================================================
# LESSON: First, set the data to a tibble
p <- ggplot(data = mpg)
p
# ==============================================================================
# LESSON: Next, set the aesthetic mappings with aes()
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy))
p
# ==============================================================================
# TIP: You can leave off the optional argument names
p <- ggplot(mpg, aes(x = displ, y = hwy))
p
# ==============================================================================
# LESSON: Next, set the positional scales
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  scale_x_continuous(
    name = "Engine Size (in liters)",
    limits = c(1, 7),
    breaks = 1:7
  ) +
  scale_y_continuous(
    name = "Highway Fuel Efficiency (in miles/gallon)",
    limits = c(10, 50),
    breaks = c(10, 20, 30, 40, 50)
  )
p
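When limits are set by hand, it is worth confirming that they cover the data, because a continuous scale by default treats values outside its limits as missing, and the geom then drops them with a warning. A small check, assuming the limits used in the script above (the helper name covers is hypothetical):

```r
library(ggplot2)  # only needed here for the mpg dataset

# Hypothetical helper: TRUE if every value of x lies inside the limits.
covers <- function(x, limits) {
  all(x >= limits[1] & x <= limits[2], na.rm = TRUE)
}

range(mpg$displ)  # observed engine sizes
range(mpg$hwy)    # observed highway mileage

x_ok <- covers(mpg$displ, c(1, 7))
y_ok <- covers(mpg$hwy, c(10, 50))
c(x = x_ok, y = y_ok)
```

If either entry were FALSE, widening the corresponding limits would keep those observations on the plot.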
# ==============================================================================
# LESSON: Finally, add a point geom
p <-
  ggplot(mpg, aes(x = displ, y = hwy)) +
  scale_x_continuous(
    name = "Engine Size (in liters)",
    limits = c(1, 7),
    breaks = 1:7
  ) +
  scale_y_continuous(
    name = "Highway Fuel Efficiency (in miles/gallon)",
    limits = c(10, 50),
    breaks = c(10, 20, 30, 40, 50)
  ) +
  geom_point()
p
# ==============================================================================
# TIP: If you leave off the scales, R will try to guess
p <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()
p
# ==============================================================================
# LESSON: We can also customize the geom with arguments
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "red", shape = "square", size = 2)
p
# SETUP: We will need tidyverse and an example dataset
library(tidyverse)
mpg
# ==============================================================================
# USECASE: Add a smooth geom (i.e., line of best fit)
ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()
ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm")
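As a sketch of what method = "lm" draws, the same straight line can be recovered by fitting the regression directly with lm() and passing its coefficients to geom_abline(). This is an illustration rather than part of the original script; the names fit, b, and p_check are made up here.

```r
library(ggplot2)

# Ordinary least squares fit of highway mileage on engine size.
fit <- lm(hwy ~ displ, data = mpg)
b <- coef(fit)  # named vector with "(Intercept)" and "displ"

# Overlay the fitted line by hand; the dashed line should coincide with
# the line drawn by geom_smooth(method = "lm").
p_check <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  geom_abline(
    intercept = b[["(Intercept)"]], slope = b[["displ"]],
    color = "orange", linetype = "dashed"
  )
p_check
```

Setting se = FALSE only hides the confidence band so that the two lines are easier to compare.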
# ==============================================================================
# USECASE: Add a line geom (i.e., connecting points)
economics
# NOTE: Since ggplot2 3.4.0, line thickness is set with linewidth (not size)
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_point()
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_point() +
  geom_line(color = "orange", linewidth = 1)
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "orange", linewidth = 1) +
  geom_point()
# ==============================================================================
# USECASE: Add reference line geoms
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_hline(yintercept = 0, color = "orange", linewidth = 1) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point()
# NOTE: On a date axis, numeric positions count days since 1970-01-01
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_vline(xintercept = 7.5, color = "orange", linewidth = 1) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point()
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_abline(intercept = 4000, slope = 0.5, color = "orange", linewidth = 1) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point()
# SETUP: We will need tidyverse and an example dataset
library(tidyverse)
mpg
# ==============================================================================
# USECASE: Continuous color scales work well with numeric variables
ggplot(mpg, aes(x = hwy, y = cty, color = displ)) +
  geom_point(size = 4)
ggplot(mpg, aes(x = hwy, y = cty, color = displ)) +
  geom_point(size = 4) +
  scale_color_continuous(type = "viridis")
# ==============================================================================
# USECASE: Use a discrete color scale with categorical variables
ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_discrete(
    name = "Drivetrain",
    breaks = c("4", "f", "r"),
    labels = c("Four Wheel", "Front Wheel", "Rear Wheel")
  )
# ==============================================================================
# PITFALL: Don't forget to set categorical variables as factors
ggplot(mpg, aes(x = displ, y = hwy, color = cyl)) +
  geom_point() # R guesses you want a continuous scale
ggplot(mpg, aes(x = displ, y = hwy, color = factor(cyl))) +
  geom_point() +
  scale_color_discrete(name = "Cylinders")
# ==============================================================================
# LESSON: Set a geom's color aesthetic to make it always that color
ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "red")
# ==============================================================================
# PITFALL: However, do this inside of geom() not aes()
ggplot(mpg, aes(x = displ, y = hwy, color = "blue")) +
  geom_point() # unintended: "blue" is treated as a data value, not a color
# ==============================================================================
# LESSON: If you both set and map color, the setting will win
ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(color = "blue")
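One way to see the difference between mapping and setting is to inspect the colors that ggplot2 actually computes. The sketch below does this with layer_data(); it is an illustration added here, and the object names are made up.

```r
library(ggplot2)

# Mapped: "blue" inside aes() is treated as a one-level categorical variable,
# so the default discrete scale assigns it a palette color.
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy, color = "blue")) +
  geom_point()
mapped_colors <- unique(layer_data(p_mapped)$colour)

# Set: color outside aes() is used literally and overrides the drv mapping.
p_set <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(color = "blue")
set_colors <- unique(layer_data(p_set)$colour)

mapped_colors  # a single palette color code, not "blue"
set_colors     # "blue"
```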
theme_apa()
theme() and this reference
# SETUP: We will need tidyverse and an example graphic
library(tidyverse)
p <-
  ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(title = "Fuel Efficiency")
p
# ==============================================================================
# USECASE: Apply a "complete" theme
p + theme_bw()
p + theme_classic()
p + theme_dark()
# ==============================================================================
# LESSON: For more precise control, we can use theme()
p + theme(legend.position = "top")
p + theme(plot.title = element_text(color = "purple", face = "bold"))
p + theme(panel.grid = element_blank())
# NOTE: There are a lot of elements to learn, so use a cheatsheet!
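When a complete theme and theme() tweaks are combined, the order matters: adding a complete theme such as theme_bw() replaces all earlier theme settings. The sketch below is an added illustration that adds theme objects to each other directly and reads one setting back with $; the names tweak, kept, and lost are made up.

```r
library(ggplot2)

tweak <- theme(legend.position = "top")

# Complete theme first, tweak second: the tweak survives.
kept <- theme_bw() + tweak

# Tweak first, complete theme second: theme_bw() replaces the tweak.
lost <- tweak + theme_bw()

kept$legend.position
lost$legend.position
```

In a plot this means writing p + theme_bw() + theme(legend.position = "top") rather than the reverse.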
ggsave()
.png for most daily purposes
.pdf or .svg
# SETUP: We will need tidyverse and an example graphic
library(tidyverse)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth() +
  labs(x = "Engine Displacement", y = "Highway MPG")
p
# ==============================================================================
# USECASE: Save a specific ggplot object to a file
ggsave(filename = "pfinal.png", plot = p)
# ==============================================================================
# LESSON: Specify the size of the image to create
ggsave(filename = "pfinal2.png", plot = p,
       width = 6, height = 3, units = "in")
# ==============================================================================
# LESSON: Just change the extension to create a different file type
ggsave(filename = "pfinal2.pdf", plot = p,
       width = 6, height = 3, units = "in")
# ==============================================================================
# PITFALL: Creating a very large image may lead to relatively small text
ggsave(filename = "p_poster.png", plot = p,
       width = 12, height = 8, units = "in")
# ==============================================================================
# TIP: You can quickly increase the text size using base_size
p2 <- p + theme_grey(base_size = 24)
ggsave(filename = "p_poster2.png", plot = p2,
       width = 12, height = 8, units = "in")
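The small-text pitfall follows from simple arithmetic: text size is given in points, an absolute unit, so the same text takes up a smaller share of a taller image. The sketch below is an added illustration; it assumes roughly 72 points per inch and the default base_size of 11, and the helper name text_share is hypothetical.

```r
# Share of the image height taken up by one line of base-size text.
# Assumption: about 72 points per inch.
text_share <- function(base_size, height_in) {
  base_size / (height_in * 72)
}

small  <- text_share(base_size = 11, height_in = 3)  # default text, 6 x 3 in
poster <- text_share(base_size = 11, height_in = 8)  # default text, 12 x 8 in
fixed  <- text_share(base_size = 24, height_in = 8)  # base_size = 24, 12 x 8 in

# Express the shares as percentages of the image height.
round(100 * c(small = small, poster = poster, fixed = fixed), 1)
```

Under these assumptions the share falls from about 5.1% to about 1.9% when the image grows to 8 inches tall, and base_size = 24 brings it back to about 4.2%.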